
A directory of human germ-line Vκ segments
reveals a strong bias in their usage

From the genomic DNA of a single individual, we have amplified, cloned and
sequenced 37 human germ-line Vκ segments. Four of these segments were new. We
then compiled a comprehensive directory of all germ-line Vκ segments and
identified 50 different sequences with open reading frames. Comparison with 236
rearranged sequences revealed that no more than 24 of these germ-line sequences
could be assigned rearranged counterparts, that some of these were rarely used,
and that only about 11 sequences are used frequently. This suggests that the
expressed Vκ repertoire is mainly derived from a limited number of segments.
Most surprisingly, the Jκ-distal region of the locus appears to be rarely used: we
could unambiguously assign 162 rearranged sequences to Vκ segments of the
Jκ-proximal region, but only 5 to segments of the Jκ-distal region.



1 Introduction

Recently antibodies of predefined binding specificity have
been derived from repertoires of associated heavy and light
chain variable (V) domains displayed on the surface of
filamentous phage ([1]; for review see [2]). The process
mimics production of antibodies by the immune system and
bypasses hybridoma technology and immunization [3, 4].
Diverse repertoires of V domains have been provided by
PCR amplification [5] of heavy and light chain V genes [6]
from populations of lymphocytes. Alternatively the repertoires
have been provided by in vitro rearrangement of
cloned V segments [7]. To provide a bank of cloned V gene
segments, we have amplified and sequenced VH and Vλ
segments from genomic DNA, leading to the isolation of
almost all the functional human VH segments [8] and many
of the human Vλ segments [9]. Here we have attempted to
gather a large selection of human germ-line Vκ segments
using the same strategy. After submission of this report, a
number of new Vκ segments were sequenced and mapped
[10, 11]. For completeness we have included this information
in our revised paper.

The human immunoglobulin light chain kappa (κ) locus is
located on the short arm of chromosome 2 (2p11-12) [12,
13] and consists of a Cκ gene, 5 Jκ segments and at least 76
mapped germ-line Vκ segments [11, 14]. The Vκ segments
may be divided by sequence homology [15] into three main
subgroups, I-III, and several smaller subgroups (IV, V, VI
and VII) [11, 16-18]. The portion of the locus harboring the
majority of Vκ segments is thought to have arisen from a
duplication event [19, 20]: 36 segments are located in the
Jκ-distal region and 40 in the Jκ-proximal region [11, 14, 21,
22]. Segments are clustered in four distinct regions, A, B, L
and O (Fig. 1) [18]. The A, L and O clusters are found
within both the Jκ-distal and the Jκ-proximal regions, the B
cluster only within the Jκ-proximal region.

Of the 76 mapped Vκ segments within the major locus, 57
have published sequences. Of these 57 segments, 48 have
alleles with open reading frames (these correspond to 50
different sequences, because some segments from the
Jκ-proximal and Jκ-distal regions are identical and
other segments have multiple alleles) and 9 have
frame shifts, stop codons or incomplete exons and are,
therefore, regarded as pseudogenes. Most of the remaining
19 segments are also known to be pseudogenes [10, 16, 17,
19, 20, 23-36] (Fig. 1). There are also a number of orphon
Vκ segments which have been mapped outside the major κ
locus: W1-W11 [37, 38], V108 [39], Chr22-1 to Chr22-5 [40,
41], Chr1 [41], cos 118 [41] and Z1-Z4 [42, 43]. There are
two further segments with open reading frames that have
not been mapped (LFVK5 and LFVK431; L. Foroni,
personal communication).

To clone the functional Vκ segments, we assembled a
database of published germ-line sequences and then
designed PCR primers to amplify each of the different
subgroups. A germ-line Vκ segment consists (5' to 3') of a
leader sequence (L), a leader intron, a continuation of the
leader sequence (L') and the Vκ exon (Fig. 2). Within the
exon there are three framework regions (FR) and three
complementarity-determining regions (CDR). Adjoining
the 3' end of the exon is a highly conserved heptamer
recombination signal and a less conserved nonamer region,
the two separated by 11-14 nucleotides. Forward PCR
primers were based in the heptamer region and back
primers on the overlap between the leader intron and L'
region (Fig. 2 and Table 1). Using these primers we amplified,
cloned and sequenced germ-line Vκ segments from the
genomic DNA extracted from the peripheral white blood
cells of a single donor (DP).
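
The degenerate primer notation used in Table 1, where bases in parentheses are alternatives at a single position, can be made concrete with a short sketch. The Python below is only an illustration and is not part of the original method; the function names `parse_degenerate`, `degeneracy` and `expand` are hypothetical. It expands the back primer VK1(BLO) into the individual oligonucleotides that the degenerate mixture would contain.

```python
import re
from itertools import product
from math import prod


def parse_degenerate(primer):
    """Return one string of allowed bases per primer position."""
    compact = primer.replace(" ", "").upper()
    # Each token is either a parenthesised set of alternatives or a single base.
    tokens = re.findall(r"\(([ACGT]+)\)|([ACGT])", compact)
    consumed = sum(len(alts) + 2 if alts else 1 for alts, _ in tokens)
    if consumed != len(compact):
        raise ValueError(f"unparseable primer: {primer!r}")
    return [alts or base for alts, base in tokens]


def degeneracy(primer):
    """Number of distinct sequences encoded by the degenerate primer."""
    return prod(len(choices) for choices in parse_degenerate(primer))


def expand(primer):
    """List every concrete sequence encoded by the degenerate primer."""
    return ["".join(combo) for combo in product(*parse_degenerate(primer))]


# Sequence of VK1(BLO) as printed in Table 1.
VK1_BLO = "CCC CCA AGC TTA ATC (TG)CA GGT (GT)CC AGA TG"

for seq in expand(VK1_BLO):
    # Every variant should carry the HindIII site (AAGCTT) near its 5' end.
    print(seq, "AAGCTT" in seq)
```

Counting the variants in this way gives a quick indication of how dilute each individual oligonucleotide is within a degenerate primer mixture.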



2 Materials and methods 
2.1 Primer design 

Forward and back primers were designed for the VκI, II, III,
IV and VI subgroups from alignments of published germ-line
sequences. For the VκI subgroup, three cluster-specific
forward primers were designed, VK1(FA), VK1(FL) and
VK1(FO), corresponding to the A, L and O clusters,
respectively. Two cluster-specific back primers, VK1(BA1)
and VK1(BLO), corresponding to the A cluster and the
combined L and O clusters, were also used. HindIII
restriction sites were incorporated into the 5' end of all back
primers. All forward primers had XbaI restriction sites at
the 5' end, except VK3(F1), which had an EcoRI site, and
the VκII primers, where the site was introduced into
the middle of the primer to replace a particularly degenerate
section. The primers were used in pairs as indicated in
Fig. 2.

Figure 1. Schematic representation of the human immunoglobulin κ locus (after [11, 21]). Germ-line Vκ segments with open reading
frames are represented by circles. The segments A4, A5, A18, A22, A28, L3, L7, L17 and L21 have frame shifts, stop codons or
incomplete Vκ exons and are, therefore, regarded as pseudogenes (squares). The remaining 19 segments have either not been sequenced
(triangle) or are known to be pseudogenes (squares). The segments L4 and L16 have a number of alleles, some with open reading frames
and others with stop codons. Where a segment has been seen rearranged in vivo, the circle has been filled (black or shaded). If the
rearranged sequences could not be unambiguously assigned to segments of the Jκ-proximal or Jκ-distal regions, the corresponding segments
from both regions have been shaded. Sequences from this study have been attributed their DPK codes (see text).

Table 1. Primers used for PCR amplification of the Vκ exon

Subgroup  Primer      Sequence (5' to 3')

VκI       VK1(FA)     GCT CTA GAC GGG CTT GTA TCA CAG TG
          VK1(FL)     GCT CTA GAG TT(CT) (AG)GG T(GT)(GT) GTA ACA CT
          VK1(FO)     GCT CTA GAA TG(AC) CTT GT(TA) ACA CTG TG
          VK1(BA1)    CCC CCA AGC TTT GTT CCT AAT ATC AGA TA
          VK1(BLO)    CCC CCA AGC TTA ATC (TG)CA GGT (GT)CC AGA TG

VκII      VK2(FA1X)   GAG GTT TTC TAG A(TG)G (GA)(GT)(CT) TGT A(GC)C ACT GTG
          VK2(FA2X)   GAG GTT TTC TAG AAG (GA)(GT)(CT) TGT A(GC)C ACT GTG
          VK2(BOA)    CCC CCA AGC TT(TA) A(TC)T TCA GGA TCC AGT G

VκIII     VK3(F1)     GGA ATT CT(CT) A(TA)G (CT)TG AAT CAC TGT G
          VK3(LA)     CCC CCA AGC TTT CCA AT(TC) T(CT)(AG) GAT ACC AC

VκIV      VK4HEPT     GCT CTA GAC GAG GCT GAA GCA CTG TG
          VK4LEA      CCC CCA AGC TTA CTA CAG GTG CCT ACG GG

VκVI      VK6HEPT1    GCT CTA GAG GGT TGT A(GA)C ACA GTG TG
          VK6LEA      CCC CCA AGC TTT TTT CAG CCT CCA GGG GT

2.2 PCR amplification and sequencing

PCR amplification was performed according to [8] with the
following modification. Products from the PCR amplifications
and restriction enzyme digests were purified using
Magic™ PCR Preps (Promega, Madison, WI). Inserts from
plaques picked from a TYE [44] plate were amplified with
an M13mp19-specific primer pair using the PCR (25 cycles,
each cycle consisting of 1 min at 94 °C, 1 min at 55 °C, 30 s at
72 °C; at the end of 25 cycles there was a final extension at
65 °C for 5 min). Sequencing was performed using an
M13-specific primer, Taq polymerase and fluorescent
dideoxy chain terminators [45]. The sequences were analyzed
on an Applied Biosystems 373A Automated DNA
Sequencer (Foster City, CA).

3 Results

Amplification using the subgroup-specific primers with
suitable adjustment of the annealing temperature gave
single PCR bands of varying intensities. Primer pairs were
found to be specific for their respective subgroups. We
sequenced 142 clones (62 VκI, 60 VκII, 14 VκIII, 1 VκIV and 5
VκVI). From these, 37 Vκ sequences were identified.
Excluding those identical to mapped orphon segments (see
below) 27 segments had open reading frames (DPK1-
DPK27): 24 were identical to published sequences and 3
segments, DPK2, DPK14 and DPK23, were new. DPK2 was
most similar to L1/HK137 (14 nucleotide changes), DPK14
to the pseudogene A5 (insertion of one nucleotide and 3
changes), and DPK23 to L10/Vh (8 nucleotide changes).
The remaining segments are identical to the pseudogenes



Table 2. Assignment of rearranged Vκ sequences to their closest germ-line counterparts

Subgroup  Germ-line sequence          No. of rearranged   Closest rearranged
                                      sequences a)        sequence Δ(N, P) b)

VκI       O8/O18/DPK1                 21                  (0, 0)
          A30                          3                  (5, 2)
          L1/HK137                     1                  (15, 8)
          A20/DPK4                     9                  (0, 0)
          L5/L19¹/Vb/Vb′/V4b/DPK5      4                  (3, 0)
          L8/Vd/DPK8                  10                  (1, 0)
          L9/Ve                        3                  (0, 0)
          L12¹/HK102/V1                1                  (37, 17)
          L12²                        12                  (5, 1)
          O12¹/V3b                     1                  (2, 1)
          O2/O12²/DPK9                30                  (0, 0)
          O4/O14/LFVK19H/DPK11         1                  (11, 6)
          Subtotal                    96

VκII      A2/DPK12                     1                  (9, 8)
          A3/A19/DPK15                11                  (0, 0)
          A17/DPK18                   11                  (1, 0)
          A1/DPK19                     1                  (0, 0)
          Subtotal                    24

VκIII     A11/humkv305/DPK20           1                  (3, 2)
          L16/humkv328/humkv328h2      2                  (3, 3)
          L2/humkv328h5/DPK21         20                  (1, 0)
          A27/humkv325/VKRF/DPK22     48                  (0, 0)
          L6/Vg                       19                  (0, 0)
          Subtotal                    90

VκIV      B3/VκIV/DPK24               22                  (1, 1)

VκV       B2/EV15                      2                  (0, 0)

VκVI      A10/A26/DPK26                2                  (3, 1)

a) Number of rearranged sequences assigned to their closest germ-line counterpart.

b) Δ(N, P) denotes the number of nucleotide (N) and amino acid residue changes (P).

Figure 3. Amino acid sequences of germ-line Vκ segments with open reading frames. Where possible, sequences have been assigned to
their respective loci (bold type). Where two alleles have different nucleotide sequences these are followed by the superscripts 1 or 2. Amino
acid sequences from this study are labeled DPK1-DPK27 (see text). References for other sequences are in brackets; [*] indicates L.
Foroni, personal communication. Amino acids are shown in single-letter codes. Sequences are arranged alphabetically within each
subgroup according to the sequence of CDR1. Where two CDR1 sequences are identical, the order is based similarly upon the sequence of
CDR2 or CDR3. The hypervariable regions, CDR1, CDR2 and CDR3 as defined by Kabat et al. [15] are labeled, as are the
antigen-binding loops, L1, L2 and L3 as defined by Chothia and Lesk [50]. Numbering of amino acid residues is according to [15] except in
CDR1 where numbering is according to [50]. The length of the L1 loop [50] is shown in italics. Germ-line sequences with known
rearranged counterparts have been labeled R/N, where N represents the number of amino acid differences between the germ-line Vκ
sequence and its closest rearranged counterpart (see Table 2). Note that L5/L19¹/Vb/Vb′/V4b/DPK5 and L19²/Vb″/DPK6; and
L16/humkv328/humkv328h2 and L2/humkv328h5/DPK21 have identical protein sequences but different nucleotide sequences (see
Fig. 4).
A4 [25], A18 [19], A22 [24], L4 (Va) [35], HK100 [31] and
to the orphon segments W2 [38], W8/W10 [38], Z2 [43],
chr22-4 [41]; DPK37 is new and is most similar to the
orphon Z3 [41] (deletion of one nucleotide and three
changes). The sequences of the four new segments, DPK2,
DPK14, DPK23 and DPK37, were confirmed with clones
from independent PCR amplifications. Subsequently, two
of these segments, DPK2 and DPK23, proved to be
identical to the sequences L14 and L25, which were
published after the submission of this paper [10]. The fact
that 35 of the 37 segments amplified from DP are identical
to published sequences confirms that germ-line Vκ sequence
polymorphism is rather limited [46].

In addition to the confirmed sequences, many clones had
sequences which were not observed in independent amplifications.
Some had one or two nucleotide differences from
the Vκ segments DPK1-DPK37 and were probably due to
PCR errors. The 21 nucleotide substitutions in the 52 VκI
sequences correspond to 5 × 10⁻⁵ changes per nucleotide
per PCR cycle and are consistent with the error rate for Taq
polymerase [47]. Other clones appeared to be due to PCR
cross-over [8, 48].
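
The per-cycle error rate quoted above follows from dividing the observed substitutions by the total number of nucleotide-cycles. The text does not state the sequenced length per clone or the cycle number used for this calculation, so the sketch below assumes roughly 300 nucleotides per clone and 25 cycles purely for illustration; with those assumed values the result is of the order of 5 × 10⁻⁵.

```python
def pcr_error_rate(substitutions, clones, nt_per_clone, cycles):
    """Substitutions per nucleotide per PCR cycle."""
    return substitutions / (clones * nt_per_clone * cycles)


# 21 substitutions in 52 clones are taken from the text.
# nt_per_clone=300 and cycles=25 are ASSUMED values, not given in the text.
rate = pcr_error_rate(substitutions=21, clones=52, nt_per_clone=300, cycles=25)
print(f"{rate:.1e} changes per nucleotide per cycle")
```

Changing the assumed length or cycle number within a plausible range alters the estimate by less than a factor of two, so the order of magnitude is robust.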

We have compiled a comprehensive directory of the amino
acid and nucleotide sequences of all germ-line Vκ segments
with open reading frames (Figs. 1, 3 and 4). Where possible,
sequences have been assigned to their respective loci. We
have compared these sequences with those of 113 rearranged
Vκ sequences taken from the GenBank/EMBL
nucleotide databases and 123 additional sequences from
the literature (our rearranged database is available on
request). All rearranged sequences were assigned to their
closest germ-line counterparts. In Table 2 we note the
closest sequence identities and the number of rearranged
sequences assigned to each germ-line sequence. In a few
examples, the rearranged sequence was found to be a
composite of two Vκ segments, presumably arising by PCR
cross-over during cloning. For example, VκI-X14 [49] is
identical to O4/O14/LFVK19H/DPK11 over the first 220
nucleotides and virtually identical to O2/O12²/DPK9 over
the last 100 nucleotides, due to a cross-over in FR3.
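
The assignment of a rearranged sequence to its closest germ-line counterpart, and the Δ(N, P) values reported in Table 2, can be sketched as follows. This is a minimal illustration and not the procedure actually used: it assumes the sequences are already aligned, of equal length and in frame, and the segment names and 12-nucleotide sequences are hypothetical toy data.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate an in-frame nucleotide sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt) - 2, 3))


def differences(a, b):
    """Count mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_closest(rearranged, germline):
    """Return (name, N, P) for the germ-line sequence with fewest changes."""
    scored = [
        (differences(rearranged, seq),
         differences(translate(rearranged), translate(seq)),
         name)
        for name, seq in germline.items()
    ]
    n, p, name = min(scored)
    return name, n, p


# Hypothetical toy sequences, for illustration only.
germline = {"seg_A": "GACATCCAGATG", "seg_B": "GAAATTGTGTTG"}
print(assign_closest("GACATCCAGCTG", germline))
```

Ranking by nucleotide differences first and amino acid differences second is one reasonable tie-breaking choice; real data would also require handling of gaps and of the V-J junction.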

Only 24 of the 50 germ-line sequences in our directory were
found to have rearranged counterparts. Although 162
rearranged sequences correspond to mapped Vκ segments
from the Jκ-proximal region of the kappa locus, only 5 are
derived from the Jκ-distal region. These correspond to the
germ-line sequences A2/DPK12, A1/DPK19, A11/humkv305/DPK20
and L16/humkv328/humkv328h2 (see Fig. 1).
Of the rearranged sequences, 69 correspond to O8/O18/DPK1,
L5/L19/Vb/Vb′/V4b/DPK5, O2/O12²/DPK9,
O4/O14/LFVK19H, A3/A19/DPK15 and A10/A26/DPK26: in
these cases each rearranged sequence could have been
derived from Vκ segments from either the Jκ-proximal or
Jκ-distal region. No rearranged sequences correspond to
the unmapped Vκ segments LFVK5, LFVK431 and
DPK14.
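
The regional totals quoted above can be checked against the per-segment counts in Table 2. The sketch below is only a bookkeeping illustration; the segment labels are abbreviated, and the counts are those read from Table 2.

```python
TOTAL_REARRANGED = 236

# Segments found only in the J-kappa-distal region (counts from Table 2).
distal = {"A2": 1, "A1": 1, "A11": 1, "L16": 2}

# Germ-line sequences shared by proximal and distal copies (counts from Table 2).
either = {"O8/O18": 21, "L5/L19": 4, "O2/O12": 30,
          "O4/O14": 1, "A3/A19": 11, "A10/A26": 2}

n_distal = sum(distal.values())
n_either = sum(either.values())
n_proximal = TOTAL_REARRANGED - n_distal - n_either

for label, n in [("proximal", n_proximal), ("distal", n_distal),
                 ("either region", n_either)]:
    print(f"{label:14s} {n:4d}  ({100 * n / TOTAL_REARRANGED:.1f}%)")
```

The three categories account for all 236 rearranged sequences, with the unambiguous assignments split 162 to 5 in favor of the proximal region.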

The distribution of the number of amino acid differences 
across all 236 assigned rearranged sequences is shown in 
Fig. 5. This includes sequences of antibodies with a wide 
range of specificities. As sequence polymorphism at the 
germ-line level is limited (see above) the majority of 
changes are probably due to somatic mutation. 



4 Discussion 

4.1 Functional Vκ segments

The map of the Vκ locus (Fig. 1) contains 76 segments. It is
estimated, on the basis of hybridization experiments, that
4-6 Vκ segments have yet to be mapped [11, 14]. The Vκ
segments LFVK5, LFVK431 and DPK14 could, therefore,
correspond to the unmapped loci or could be allelic variants
of mapped Vκ segments.

From our sequence directory (Figs. 3 and 4) and from the
assignment of 236 rearranged sequences to their closest
germ-line counterparts (Table 2) we find that no more than
24 of the 50 germ-line sequences have rearranged counterparts,
suggesting that the expressed Vκ repertoire is mainly
derived from these segments. Indeed for 7 of these Vκ
sequences we could find only a single example of a
rearrangement, and in 4 cases (L1/HK137, L12¹/HK102/V1,
O4/O14/LFVK19H/DPK11 and A2/DPK12) the
assignment is tentative in view of the number of sequence
differences. It is also possible that some of the remaining
segments with open reading frames are functional, but are
only rarely used. Others may well prove to have frame shifts
or stop codons in the leader exon, defective recombination
signals or splice sites or other cis-acting defects which
render them non-functional. A14/DPK25, for example, has
an altered heptamer which may prevent its rearrangement.
We have made similar observations for human germ-line VH
segments where only 49/83 germ-line sequences with open
reading frames were seen as rearranged genes [8].

4.2 Vκ segment usage

The assignment of rearranged Vκ genes to their germ-line
counterparts (Table 2) confirms that segments from all four
clusters (A, B, L and O) and from six subgroups (I-VI) are
used. Segments from the VκI and VκIII subgroups are used
most frequently (96 and 90 rearranged sequences respectively);
those from the VκII and VκIV subgroups less
frequently (24 and 22 respectively); and those from the VκV
and VκVI subgroups rarely (2 rearranged sequences each).
The only VκVII segment, B1, has no rearranged counterpart
in our database. As noted above, some of the Vκ
segments are rarely used, and of the 24 germ-line
sequences, only about 11 appear to be frequently used
(Table 2).

Figure 5. Distribution of the number of amino acid differences
between each sequence in a database of rearranged Vκ sequences
(236 in total) and its closest germ-line counterpart in our directory.
The database of rearranged Vκ sequences is available on disk on
request.

Of the rearranged sequences, 162 correspond to Vκ segments
from the Jκ-proximal region of the kappa locus,
whereas only 5 appear to be derived from the Jκ-distal
region. Indeed, one of these rearranged sequences from
the Jκ-distal region, EVJK11, which corresponds to
A11/humkv305/DPK20, is the product of aberrant recombination
[30]. An additional 69 rearranged sequences could
conceivably be derived from Vκ segments in either the
proximal or distal regions. However, as has been previously
noted, the segments O2 (distal) and O12 (proximal) differ
by one nucleotide in a region which flanks the exon [29]:
rearranged sequences which include this region appear to
be derived from O12 rather than O2 [29]. Although there is
a clear bias towards use of the Jκ-proximal Vκ segments, we
can detect no bias within the Jκ-proximal region.

The reason for the remarkable bias towards use of Vκ
segments from the Jκ-proximal region is unclear. It may be
due to the relative distances of the proximal and distal
regions from the Jκ segments or the different recombination
mechanisms necessary to produce the rearranged Vκ gene:
most Vκ segments in the proximal region rearrange by
deletion, whereas those in the distal region must rearrange
by inversion. We note that lack of the Jκ-distal copy of the
locus (haplotype 11) does not appear to be deleterious
[21].

The assignment of rearranged Vκ genes may help in
dissecting the mechanisms of the human immune system.
For example, more than half of the rearranged sequences in
our database have three or fewer residue changes (Fig. 5).
Antibodies from patients with chronic lymphocytic leukemia
(CLL) and X-linked agammaglobulinemia (XLA)
are rarely mutated or unmutated. More highly mutated Vκ
genes tend to be seen in myelomas and in antibodies which
are subject to an antigen-driven response. In the case of
autoantibodies, the level of somatic mutation appears to be
normal.



4.3 Structures of loops implicit in the human Vκ
segments

The antigen-binding loops of immunoglobulin variable
domains have been shown to adopt a limited number of
main chain conformations or "canonical structures" [50,
51]. The structure of each loop depends on its length and the
identity of certain key residues involved in its packing, and
using this information it is sometimes possible to predict the
structure of the loops from the sequence of the Vκ domain
[51]. Here we have attempted to identify the loop structures
of those human Vκ segments that appear to undergo
rearrangement (Table 2).

Across different species, the Vκ antigen binding site loop L1
(residues 26-32, corresponding to CDR1, see Fig. 3) can
have lengths of 6, 7, 8, 11, 12 and 13 residues and
presumably can form at least six major conformations [50,
51]. The structures of antibodies with loops of 6, 7, 12 and 13
residues have been solved crystallographically: in each case
residue 29 in the loop is buried in the β-sheet framework,
and packs against residues 2, 25, 33 and 71 [50, 51]. These
packing contacts are often very similar: Ile/Val 2, Ala/Ser
25, Val/Ile/Leu 29, Leu/Met 33 and Tyr/Phe 71 [50].
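
The packing contacts listed above lend themselves to a simple check of whether a given sequence carries the typical residues. The sketch below is illustrative only: the function name `atypical_packing` and the example residues are hypothetical, and a complete canonical-structure prediction would also take loop length into account.

```python
# Typical residues at the L1 packing positions, as listed in the text
# (one-letter amino acid codes).
L1_PACKING = {
    2: {"I", "V"},
    25: {"A", "S"},
    29: {"V", "I", "L"},
    33: {"L", "M"},
    71: {"Y", "F"},
}


def atypical_packing(residues):
    """Return {position: residue} for key positions with atypical residues.

    `residues` maps residue number to one-letter amino acid code;
    positions missing from the input are reported as '?'.
    """
    return {
        pos: residues.get(pos, "?")
        for pos, allowed in L1_PACKING.items()
        if residues.get(pos, "?") not in allowed
    }


# Hypothetical example residues, for illustration only.
typical = {2: "I", 25: "A", 29: "I", 33: "L", 71: "F"}
unusual = {2: "I", 25: "A", 29: "I", 33: "L", 71: "A"}
print(atypical_packing(typical))
print(atypical_packing(unusual))
```

An empty result indicates that all five positions carry residues compatible with the conserved packing described in the text.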

However, for those human Vκ segments with rearranged
counterparts (Table 2), we only see L1 loops of 7, 8, 12 and
13 residues (Fig. 3). In all cases the packing contacts are
highly conserved, suggesting that the segments should
encode four major conformations of the L1 loop. Of the
germ-line sequences, 17 (141 rearranged sequences)
encode a 7-residue L1 loop, 2 sequences (49 rearranged
sequences of which 48 use the VκIII segment A27) encode
an 8-residue loop, 4 sequences (24 rearranged sequences)
encode a 12-residue loop and 1 sequence (22 rearranged
sequences) encodes a 13-residue L1 loop. L1 lengths of 6
residues have only been seen in mice [15], suggesting that
mouse Vκ segments encode structures which cannot be
encoded by human Vκ segments.

The L2 loop (residues 50-52, corresponding to CDR2, see
Fig. 3) is only three residues long in all structures, and
undergoes packing interactions with framework residues 48
and 64 [50, 51]. All germ-line Vκ sequences with rearranged
counterparts (Fig. 3 and Table 2) have Ile 48 and Gly 64,
indicating that they are likely to encode a single loop
conformation.

The L3 loop (residues 91-96, corresponding to CDR3, see
Fig. 3) is encoded mainly by the Vκ segment, and is most
often a six-residue loop with Gln, Asn or His at 90 and Pro
at 95 [50, 51]. Almost all human germ-line Vκ sequences
with rearranged counterparts encode Gln 90 and Pro 95,
except A20/DPK4 (Lys 90), O4/O14/LFVK19H/DPK11
(Arg 90), L12¹/HK102/V1 and L12² (Ser 95). This indicates
that most human Vκ germ-line segments encode a single
conformation of this loop, but depending on the location of
the V-J join and nucleotide addition, other conformations
may be formed in the rearranged gene.

We note that several of the germ-line sequences that do not
have rearranged counterparts in our database (Fig. 3 and
Table 2) have atypical residues at the key positions involved
in the packing of the antigen-binding loops (L2 loop: L22
[Leu 48]; LFVK5 and O11²/V3a [Asp 64]; L3 loop: L20/Vg″
[His 95]) or unusual loop lengths (B1 has an L1 of 11
residues).

In combination, the three antigen-binding loops L1, L2 and
L3 of human germ-line Vκ segments are likely to encode
four major folds. Since our repertoire of cloned human
germ-line Vκ segments includes 15 of the 24 germ-line Vκ
sequences with rearranged counterparts and examples of
each of the four major folds, it should be a valuable resource
for building synthetic antibodies for use in phage display
libraries.